Implementation and New Variants Exploration of the Multi-Perspective Context Matching Deep Neural Network Model for Machine Comprehension
Abstract
This project explores the multi-perspective context matching method for the task of reading comprehension using the SQuAD dataset. The original six-layer model presents an interesting system for exploring deep learning architectures and their implementation in TensorFlow. The first step was to design an efficient implementation of this complex model in TensorFlow. The second step, and the aim of this project, was to devise new model variants that could improve model quality and provide helpful insights to researchers. The main variants we experimented with were: bi-directional GRUs in place of bi-directional LSTMs, learning rate annealing, gradient clipping, the Tanimoto coefficient for the filter layer, model ensembles, hyper-parameter search, and regularization. Implementing these variants yielded some interesting insights; the results and observations are described in the paper.

1 Background Work and Introduction

Developing question answering systems for machine comprehension is a compelling task in the field of natural language processing. The advent of state-of-the-art deep learning architectures and the introduction of the large SQuAD dataset have provided a platform for developing such architectures and have given rise to promising systems. One state-of-the-art system, ranked third on the SQuAD leaderboard, is the BiDAF (Bi-Directional Attention Flow) network [4], which employs character-level, word-level and contextual embeddings along with a bi-directional attention flow. The BiDAF paper gave us a clear explanation of how to obtain a query-aware context representation and how a multi-stage architecture for a question answering system can be developed. It was also our entry point to other architectures and attention methods.
Another interesting system is the DCN (Dynamic Co-attention Network) [2], which proposes a co-attention method that attends to the question and the document simultaneously and then uses both attention contexts. Such a co-attentive encoder can capture interesting interactions between the question and the document, and it gave us insight into another kind of attention network. At this point it became clear to us that an effective attention network is the heart of a machine comprehension system and would form an integral part of whichever system we chose to design or implement. We decided to implement the Multi-Perspective Context Matching [1] system, which predicts the answer span by matching the context of each point in the passage with the question from multiple perspectives. The underlying assumption is that a span in the passage is more likely to be the correct answer if its context is very similar to the question.
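To make the matching idea concrete, the following is a minimal pure-Python sketch, not the project's TensorFlow code. It assumes the formulation in the Multi-Perspective Context Matching paper [1]: a filter layer that scales each passage vector by its maximum similarity to any question vector, and a matching function that returns one cosine score per perspective after re-weighting the dimensions with a row of a weight matrix. The function names, the `sim` argument, and the way the Tanimoto coefficient is swapped in for cosine similarity are illustrative assumptions; the source does not specify how that variant was wired into the filter layer.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b, eps=1e-8):
    """Cosine similarity; eps guards against division by zero."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + eps)


def tanimoto(a, b, eps=1e-8):
    """Tanimoto coefficient: a.b / (|a|^2 + |b|^2 - a.b)."""
    ab = dot(a, b)
    return ab / (dot(a, a) + dot(b, b) - ab + eps)


def filter_passage(passage, question, sim=cosine):
    """Scale each passage vector by its best similarity to any question vector.

    `sim` is a hypothetical hook for choosing cosine or Tanimoto similarity.
    """
    filtered = []
    for p in passage:
        relevancy = max(sim(q, p) for q in question)
        filtered.append([relevancy * x for x in p])
    return filtered


def multi_perspective_match(v1, v2, perspectives):
    """Return one cosine score per perspective.

    Each row of `perspectives` re-weights the dimensions of both vectors
    before they are compared.
    """
    scores = []
    for weights in perspectives:
        w1 = [w * x for w, x in zip(weights, v1)]
        w2 = [w * x for w, x in zip(weights, v2)]
        scores.append(cosine(w1, w2))
    return scores
```

Unlike cosine similarity, the Tanimoto coefficient is sensitive to vector magnitude: two parallel vectors of different lengths score below 1, which changes how strongly the filter layer suppresses passage words.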
Similar resources
A new model for Machine Comprehension via multi-perspective context matching and bidrectional attention flow
To answer a question about a context paragraph, there needs to be a complex model for interactions between these two. Previous Machine Comprehension (MC) datasets were either not large enough to train end-to-end deep neural networks, or not hard to learn. Recently, after the release of the SQuAD dataset, several adept models have been proposed for the task of MC. In this work we try to combine the ...
Assignment 4: Question Answering on the SQuAD Dataset with Part-of-speech Tagging
This research applies deep learning with bi-LSTMs to train a model that responds to queries on the Stanford Question Answering Dataset (SQuAD). The model design was motivated by Wang et al.’s December 2016 IBM Research paper on multi-perspective context matching for machine comprehension. Using TensorFlow, we implemented a multi-layer neural network architecture utilizing bi-directional LSTMs ...
Neuron Mathematical Model Representation of Neural Tensor Network for RDF Knowledge Base Completion
In this paper, a state-of-the-art neuron mathematical model of the neural tensor network (NTN) is proposed for the RDF knowledge base completion problem. One of the difficulties with the parameter of the network is that representation of its neuron mathematical model is not possible. For this reason, a new representation of this network is suggested that solves this difficulty. In the representation, th...
Multi-Perspective Context Matching for Machine Comprehension
Previous machine comprehension (MC) datasets are either too small to train end-to-end deep learning models, or not difficult enough to evaluate the ability of current MC techniques. The newly released SQuAD dataset alleviates these limitations, and gives us a chance to develop more realistic MC models. Based on this dataset, we propose a Multi-Perspective Context Matching (MPCM) model, which is ...
Three New Systematic Approaches for Computing Heffron-Phillips Multi-Machine Model Coefficients (RESEARCH NOTE)
This paper presents three new systematic approaches for computing coefficient matrices of the Heffron-Phillips multi-machine model (K1, …, K6). The amount of computations needed for conventional and three new approaches are compared by counting number of multiplications and divisions. The advantages of new approaches are: (1) their computation burdens are less than 73 percent of that of convent...
Publication date: 2017